aws-sagemaker-deploy's Introduction

⚠️ BentoCTL project has been deprecated

Please see the latest BentoML documentation on the OCI-container based deployment workflow: https://docs.bentoml.com/

AWS Sagemaker Operator

SageMaker is a fully managed AWS service for building, training, and deploying ML models. BentoML provides first-class support for deploying bentos to AWS SageMaker without extra setup work from users. With the BentoML serving framework and bentoctl, users get the performance and scalability of SageMaker with any popular ML framework.

Note: This operator is compatible with BentoML version 1.0.0 and above. For older versions, please switch to the branch pre-v1.0 and follow the instructions in the README.md.

Quickstart with bentoctl

This quickstart will walk you through deploying a bento as an AWS Sagemaker Endpoint. Make sure to go through the prerequisites section and follow the instructions to set everything up.

Prerequisites

  1. BentoML version 1.0 or above. Please follow the Installation guide.
  2. Terraform - Terraform is a tool for building, configuring, and managing infrastructure. Installation instructions: www.terraform.io/downloads
  3. AWS CLI - installed and configured with an AWS account that has permissions for SageMaker, Lambda, and ECR. Please follow the Installation guide.
  4. Docker - Installation instructions: docs.docker.com/install
  5. A built Bento project. For this guide, we will use the Iris classifier bento from the BentoML quickstart guide. You can also use your own Bentos that are available locally.

Steps

  1. Install bentoctl via pip

    $ pip install bentoctl
    
  2. Install AWS Sagemaker operator

    Bentoctl will install the official AWS Sagemaker operator and its dependencies.

    $ bentoctl operator install aws-sagemaker
    
  3. Initialize deployment with bentoctl

    Follow the interactive guide to initialize the deployment project.

    $ bentoctl init
    
    Bentoctl Interactive Deployment Config Builder
    
    Welcome! You are now in interactive mode.
    
    This mode will help you setup the deployment_config.yaml file required for
    deployment. Fill out the appropriate values for the fields.
    
    (deployment config will be saved to: ./deployment_config.yaml)
    
    api_version: v1
    name: quickstart
    operator: aws-sagemaker
    template: terraform
    spec:
        region: ap-south-1
        instance_type: ml.t2.medium
        initial_instance_count: 1
        timeout: 60
        enable_data_capture: False
        destination_s3_uri:
        initial_sampling_percentage: 1
    filename for deployment_config [deployment_config.yaml]:
    deployment config generated to: deployment_config.yaml
    ✨ generated template files.
      - ./main.tf
      - ./bentoctl.tfvars

    This also runs the bentoctl generate command for you, producing the main.tf Terraform file, which specifies the resources to be created, and the bentoctl.tfvars file, which contains the values for the variables used in main.tf.

  4. Build and push an AWS SageMaker-compatible Docker image to the registry

    Bentoctl will build and push the SageMaker-compatible Docker image to an AWS ECR repository.

    bentoctl build -b iris_classifier:latest -f deployment_config.yaml
    
    Step 1/22 : FROM bentoml/bento-server:1.0.0a6-python3.8-debian-runtime
     ---> 046bc2e28220
    Step 2/22 : ARG UID=1034
     ---> Using cache
     ---> f44cfa910c52
    Step 3/22 : ARG GID=1034
     ---> Using cache
     ---> e4d5aed007af
    Step 4/22 : RUN groupadd -g $GID -o bentoml && useradd -m -u $UID -g $GID -o -r bentoml
     ---> Using cache
     ---> fa8ddcfa15cf
    ...
    Step 22/22 : CMD ["bentoml", "serve", ".", "--production"]
     ---> Running in 28eccee2f650
     ---> 98bc66e49cd9
    Successfully built 98bc66e49cd9
    Successfully tagged quickstart:kiouq7wmi2gmockr
    🔨 Image build!
    Created the repository quickstart
    The push refers to repository
    [213386773652.dkr.ecr.ap-south-1.amazonaws.com/quickstart]
    kiouq7wmi2gmockr: digest:
    sha256:e1a468e6b9ceeed65b52d0ee2eac9e3cd1a57074eb94db9c263be60e4db98881 size: 3250
    63984d77b4da: Pushed
    2bc5eef20c91: Pushed
    ...
    da0af9cdde98: Layer already exists
    e5baccb54724: Layer already exists
    🚀 Image pushed!
    ✨ generated template files.
      - ./bentoctl.tfvars
      - ./startup_script.sh

    The iris-classifier service is now built and pushed to the container registry, and the required Terraform files have been created. Now we can use Terraform to perform the deployment.

  5. Apply Deployment with Terraform

    1. Initialize terraform project. This installs the AWS provider and sets up the terraform folders.

      $ terraform init
    2. Apply terraform project to create Sagemaker deployment

      $ terraform apply -var-file=bentoctl.tfvars -auto-approve
      
      aws_iam_role.iam_role_lambda: Creating...
      aws_iam_role.iam_role_sagemaker: Creating...
      aws_apigatewayv2_api.lambda: Creating...
      aws_apigatewayv2_api.lambda: Creation complete after 1s [id=rwfej5qsf6]
      aws_cloudwatch_log_group.api_gw: Creating...
      aws_cloudwatch_log_group.api_gw: Creation complete after 1s [id=/aws/api_gw/quickstart-gw]
      aws_apigatewayv2_stage.lambda: Creating...
      aws_apigatewayv2_stage.lambda: Creation complete after 3s [id=$default]
      aws_iam_role.iam_role_sagemaker: Creation complete after 7s [id=quickstart-sagemaker-iam-role]
      aws_sagemaker_model.sagemaker_model: Creating...
      aws_iam_role.iam_role_lambda: Creation complete after 8s [id=quickstart-lambda-iam-role]
      aws_lambda_function.fn: Creating...
      ...
      
      
      Apply complete! Resources: 1 added, 0 changed, 0 destroyed.
      
      Outputs:
      
      endpoint = "https://rwfej5qsf6.execute-api.ap-south-1.amazonaws.com/"
      ecr_image_tag = "213386773652.dkr.ecr.ap-south-1.amazonaws.com/quickstart:sfx3dagmpogmockr"
  6. Test deployed endpoint

    The iris_classifier uses the /classify endpoint for receiving requests, so the full URL for the classifier will be of the form {EndpointUrl}/classify.

    URL=$(terraform output -json | jq -r .endpoint.value)classify
    curl -i \
      --header "Content-Type: application/json" \
      --request POST \
      --data '[5.1, 3.5, 1.4, 0.2]' \
      $URL
    
    HTTP/2 200
    date: Thu, 14 Apr 2022 23:02:45 GMT
    content-type: application/json
    content-length: 1
    apigw-requestid: Ql8zbicdSK4EM5g=
    
    0%

Note: You can also invoke the SageMaker endpoint directly. If there is only one service, the SageMaker deployment will choose that one. If there is more than one, you can specify which service to use by passing the X-Amzn-SageMaker-Custom-Attributes header with the name of the service as its value.
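
For example, here is a minimal sketch of invoking the SageMaker endpoint directly with boto3. The endpoint name quickstart-endpoint and the classify route are assumptions based on the quickstart above; adjust them for your own deployment.

    # Minimal sketch: invoke the SageMaker endpoint directly with boto3.
    # "quickstart-endpoint" and the "classify" route are assumptions taken
    # from the quickstart above; adjust them for your deployment.
    import json

    import boto3

    runtime = boto3.client("sagemaker-runtime", region_name="ap-south-1")

    response = runtime.invoke_endpoint(
        EndpointName="quickstart-endpoint",    # assumed: <deployment_name>-endpoint
        ContentType="application/json",
        CustomAttributes="classify",           # service route, without the leading slash
        Body=json.dumps([5.1, 3.5, 1.4, 0.2]),
    )
    print(response["Body"].read().decode())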

  7. Delete deployment

    Use the bentoctl destroy command to remove the registry and the deployment.

    bentoctl destroy -f deployment_config.yaml
    

Configuration Options

A sample configuration file has been provided in the repository; feel free to copy it over and change it for your specific deployment values. A minimal example is also sketched after the field descriptions below.

  • region: AWS region to which the SageMaker endpoint is deployed
  • instance_type: The ML compute instance type for the SageMaker endpoint. See https://docs.aws.amazon.com/cli/latest/reference/sagemaker/create-endpoint-config.html for available instance types
  • initial_instance_count: Number of instances to launch initially
  • timeout: Timeout for API requests, in seconds
  • enable_data_capture: Enable SageMaker to capture data from requests and responses and store the captured data in AWS S3
  • destination_s3_uri: S3 bucket path where captured data is stored
  • initial_sampling_percentage: Percentage of requests that will be captured to the S3 bucket
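
A minimal deployment_config.yaml sketch using these fields is shown below; the values mirror the interactive init output above and should be adjusted for your deployment.

    api_version: v1
    name: quickstart
    operator: aws-sagemaker
    template: terraform
    spec:
        region: ap-south-1
        instance_type: ml.t2.medium
        initial_instance_count: 1
        timeout: 60
        enable_data_capture: False
        destination_s3_uri: ""
        initial_sampling_percentage: 1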

Troubleshooting

By default, SageMaker is configured with CloudWatch for metrics and logs. To see the CloudWatch logs for the deployment, follow the console steps below or use the boto3 sketch after them.

  1. Open the Amazon CloudWatch console at https://console.aws.amazon.com/cloudwatch/.
  2. In the navigation pane, choose Logs -> Log groups.
  3. Head over to /aws/sagemaker/Endpoints/<deployment_name>-endpoint
  4. Choose the latest log streams
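
The same logs can also be fetched programmatically. Below is a minimal boto3 sketch; the deployment name quickstart and the region are assumptions taken from the quickstart above.

    # Minimal sketch: read the latest SageMaker endpoint logs from CloudWatch.
    # The deployment name "quickstart" and the region are assumptions.
    import boto3

    logs = boto3.client("logs", region_name="ap-south-1")
    log_group = "/aws/sagemaker/Endpoints/quickstart-endpoint"

    # Find the most recently active log stream in the endpoint's log group.
    streams = logs.describe_log_streams(
        logGroupName=log_group,
        orderBy="LastEventTime",
        descending=True,
        limit=1,
    )["logStreams"]

    if streams:
        events = logs.get_log_events(
            logGroupName=log_group,
            logStreamName=streams[0]["logStreamName"],
            limit=50,
        )["events"]
        for event in events:
            print(event["message"])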

aws-sagemaker-deploy's Issues

SageMaker cannot hit the endpoint

Describe the bug

After deploying the dev endpoint (ref: #13), I cannot get a response with this:

curl -i \
    -X POST \
    -F image=@data/mobile-sample.png \
    https://123.execute-api.region.amazonaws.com/prod/predict

Looking at CloudWatch, the logs show the error reproduced in the traceback below (screenshot omitted).

To Reproduce

  1. Deploy an API on SageMaker that takes an image as input.
  2. Call the API by sending an Image

Expected behavior

Running the API locally with bentoml serve Model:latest works, but the same request fails against the API deployed on SageMaker.

Screenshots/Logs

Traceback (most recent call last):
  File "/opt/conda/lib/python3.8/site-packages/flask/app.py", line 2070, in wsgi_app
    response = self.full_dispatch_request()
  File "/opt/conda/lib/python3.8/site-packages/flask/app.py", line 1515, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/opt/conda/lib/python3.8/site-packages/flask/app.py", line 1513, in full_dispatch_request
    rv = self.dispatch_request()
  File "/opt/conda/lib/python3.8/site-packages/flask/app.py", line 1499, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**req.view_args)
  File "/bento/wsgi.py", line 25, in view_function
    response = api.handle_request(req)
  File "/opt/conda/lib/python3.8/site-packages/bentoml/service/inference_api.py", line 294, in handle_request
    inf_task = self.input_adapter.from_http_request(request)
  File "/opt/conda/lib/python3.8/site-packages/bentoml/adapters/utils.py", line 129, in _method
    return method(self, req)
  File "/opt/conda/lib/python3.8/site-packages/bentoml/adapters/file_input.py", line 148, in from_http_request
    _, _, files = HTTPRequest.parse_form_data(req)
  File "/opt/conda/lib/python3.8/site-packages/bentoml/types.py", line 234, in parse_form_data
    stream, form, files = parse_form_data(environ, silent=False)
  File "/opt/conda/lib/python3.8/site-packages/werkzeug/formparser.py", line 126, in parse_form_data
    return FormDataParser(
  File "/opt/conda/lib/python3.8/site-packages/werkzeug/formparser.py", line 230, in parse_from_environ
    return self.parse(get_input_stream(environ), mimetype, content_length, options)
  File "/opt/conda/lib/python3.8/site-packages/werkzeug/formparser.py", line 265, in parse
    return parse_func(self, stream, mimetype, content_length, options)
  File "/opt/conda/lib/python3.8/site-packages/werkzeug/formparser.py", line 142, in wrapper
    return f(self, stream, *args, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/werkzeug/formparser.py", line 290, in _parse_multipart
    raise ValueError("Missing boundary")

Environment:

  • OS: Debian 10
  • Python Version: 3.7.10
  • BentoML Version: 0.13.1
  • Using Docker Image: bentoml/model-server:0.13.1-py38-gpu

Additional context

Issue with SageMaker Deployment

Describe the bug

I am trying to deploy multiple versions (staging and production) of the same model on AWS SageMaker using the aws-sagemaker-deploy tool. However, after successfully deploying the staging variant, I cannot deploy the production version because of the error quoted under Expected behavior below (screenshot omitted).

To Reproduce

BENTO_BUNDLE_PATH=$(bentoml get Model:latest --print-location -q)
python deploy.py $BENTO_BUNDLE_PATH model-stg sagemaker_config.json
python deploy.py $BENTO_BUNDLE_PATH model-prd sagemaker_config.json  # <- error here

Expected behavior

It should have deployed successfully, but gives this error:

Export with name OutputApiId is already exported by stack model-stg-endpoint. Rollback requested by user.

Screenshots/Logs

N/A

Environment:

  • OS: Debian GNU/Linux 10 (buster)
  • Python Version: Python 3.7.10 | packaged by conda-forge
  • BentoML Version: bentoml, version 0.13.1

Additional context

Image not built correctly in GitHub Actions

This issue originates from an issue in the Bentoctl repository. I later found out it was more related to the aws-sagemaker operator and I have therefore moved the issue.

When building the docker image with bentoctl build in GitHub Actions, the PATH environment variable is not set correctly in the resulting image (screenshot omitted).

If I build the docker image locally instead, the PATH environment variable is set correctly, i.e. this problem only occurs when using GitHub Actions.

To fix it, I added a line to the Dockerfile.template file that appends export PATH=$BENTO_PATH:$PATH to /root/.bashrc. A longer-term solution would be to find the root cause of this issue, and maybe rewrite the Dockerfile.template file so that it also works in GitHub Actions.

To reproduce:

Step 1 - package versions

bentoml==1.0.0a7
bentoctl==0.2.2
aws-sagemaker-deploy==v0.1.0

Step 2 - building the bento

Model is saved as a MLFlow model (scikit-learn).

I build the bento with the following bentofile: bentoml build -f bentofile.yml

# bentofile.yml
service: "service.py:svc"
include:
 - "*.py" 
python:
  packages:
   - scikit-learn
   - pandas
   - mlflow
docker:
  distro: debian
  gpu: False
  python_version: "3.8.9"

Step 3 - containerize bento, i.e. build image with bentoctl and aws-sagemaker operator

I containerize with the following deployment_config.yml file: bentoctl build -b model:latest -f deployment_config.yml --dry-run

# deployment_config.yml
api_version: v1
name: my-model
operator:
  name: aws-sagemaker
template: terraform
spec:
  region: eu-central-1
  instance_type: ml.t2.medium
  initial_instance_count: 1
  timeout: 60
  initial_sampling_percentage: 1

Support micro-batching in BentoML SageMaker deployment

Is your feature request related to a problem? Please describe.

Currently, BentoML deployment on SageMaker does not utilize the micro-batching capability: https://github.com/bentoml/BentoML/blob/master/bentoml/deployment/sagemaker/sagemaker_serve.py

Describe the solution you'd like
Needs investigation

Describe alternatives you've considered
SageMaker expects a WSGI app and we may need to add a WSGI wrapper to the bentoML Marshal Server.

Additional context
n/a

Exception in Bento's Flask usage when invoking Sagemaker endpoint

Hi there!

I just bundled and deployed a PyTorch image using the deploy.py file. Everything seems to deploy fine, but when I invoke the endpoint it crashes because BentoML is using a Flask call that appears not to exist.

I'm not sure if I've configured something wrong or if there's a version mismatch somewhere in the Bento + Docker stack.

Here is how I am invoking my endpoint:

aws sagemaker-runtime invoke-endpoint \
  --endpoint-name my-endpoint-name \
  --body '{"data":["foo"]}' \
  --cli-binary-format raw-in-base64-out \
  --content-type "application/json" \
  >(cat) 1>/dev/null | echo

In the logs, here is what that invocation looks like:

10.32.0.2 - - [25/Aug/2021:18:03:12 +0000] "POST /invocations HTTP/1.1" 500 290 "-" "AHC/2.0"

And then here is the exception that follows:

[2021-08-25 18:03:12,772] ERROR in app: Exception on /invocations [POST]

Traceback (most recent call last):
  File "/opt/conda/lib/python3.8/site-packages/flask/app.py", line 2447, in wsgi_app
    response = self.full_dispatch_request()
  File "/opt/conda/lib/python3.8/site-packages/flask/app.py", line 1952, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/opt/conda/lib/python3.8/site-packages/flask/app.py", line 1821, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/opt/conda/lib/python3.8/site-packages/flask/_compat.py", line 39, in reraise
    raise value
  File "/opt/conda/lib/python3.8/site-packages/flask/app.py", line 1950, in full_dispatch_request
    rv = self.dispatch_request()
  File "/opt/conda/lib/python3.8/site-packages/flask/app.py", line 1936, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/bento/wsgi.py", line 25, in view_function
    response = api.handle_request(req)
  File "/opt/conda/lib/python3.8/site-packages/bentoml/service/inference_api.py", line 295, in handle_request
    req = HTTPRequest.from_flask_request(request)
  File "/opt/conda/lib/python3.8/site-packages/bentoml/types.py", line 241, in from_flask_request
    tuple((k, v) for k, v in request.headers.items()), request.get_data(),

AttributeError: 'HTTPRequest' object has no attribute 'get_data'

incorrect Content-Type header casing in created lambda

When deploying with terraform:

terraform apply -var-file=bentoctl.tfvars -auto-approve

An aws lambda is created with the following code:
see source

        sagemaker_response = runtime.invoke_endpoint(
            EndpointName=\"newstory-demo-endpoint\",
            ContentType=safeget(event, 'headers', 'content-type', default='application/json'),
            CustomAttributes=safeget(event, 'path', default='')[1:],
            Body=b64decode(event.get('body')) if event.get('isBase64Encoded') else event.get('body')
        )

For every content type that is not 'application/json' this will not work, since the header name should be "Content-Type" and not 'content-type' (the casing is different; "Content-Type" is also the name that standard libraries, e.g. Python's requests, give to this header).
This should be fixed by normalizing the key casing inside the 'safeget' function or by changing the 'content-type' key to 'Content-Type'.
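
A minimal sketch of the normalization idea (a hypothetical helper, not the operator's actual code): look up the header case-insensitively before falling back to the default.

# Hypothetical helper, not the operator's actual code: look up a header
# case-insensitively so both "Content-Type" and "content-type" work.
def get_header(event, name, default="application/json"):
    headers = event.get("headers") or {}
    lowered = {k.lower(): v for k, v in headers.items()}
    return lowered.get(name.lower(), default)


# Example: the gateway sent the canonical casing, the lookup still succeeds.
event = {"headers": {"Content-Type": "image/png"}}
print(get_header(event, "content-type"))  # -> image/png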

GPU driver too old

Hi there,

I followed the tutorial in the README and tried to deploy on ml.g4dn.xlarge (to run GPU inference)

The cloud watch logs contain the following error

RuntimeError: The NVIDIA driver on your system is too old (found version 11040). Please update your GPU driver by downloading and installing a new version from the URL: http://www.nvidia.com/Download/index.aspx Alternatively, go to: https://pytorch.org to install a PyTorch version that has been compiled with your version of the CUDA driver.

Thank you in advance for your help!

Auto-scaling

Hi,
I want to ask if it is possible to enable Auto-scaling for SageMaker inference?
Thanks
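
SageMaker real-time endpoints can generally be scaled with Application Auto Scaling, though whether this operator wires that up is not covered here. A minimal boto3 sketch under that assumption (the endpoint and variant names are hypothetical):

# Minimal sketch: register a SageMaker endpoint variant with Application
# Auto Scaling. "quickstart-endpoint" and "AllTraffic" are hypothetical names.
import boto3

autoscaling = boto3.client("application-autoscaling", region_name="ap-south-1")
resource_id = "endpoint/quickstart-endpoint/variant/AllTraffic"

autoscaling.register_scalable_target(
    ServiceNamespace="sagemaker",
    ResourceId=resource_id,
    ScalableDimension="sagemaker:variant:DesiredInstanceCount",
    MinCapacity=1,
    MaxCapacity=4,
)

autoscaling.put_scaling_policy(
    PolicyName="invocations-per-instance-target",
    ServiceNamespace="sagemaker",
    ResourceId=resource_id,
    ScalableDimension="sagemaker:variant:DesiredInstanceCount",
    PolicyType="TargetTrackingScaling",
    TargetTrackingScalingPolicyConfiguration={
        "TargetValue": 70.0,
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance"
        },
    },
)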

Dockerfile doesn't load nginx or /health endpoint

Thanks for this project!

As best as I can tell, when I follow the instructions, the resulting Docker ENTRYPOINT and CMD still seem to be the same as a non-SageMaker container -- none of the nginx, wsgi, etc. files are being loaded. Instead, it runs:

./docker-entrypoint.sh bentoml serve-gunicorn /bento

I'm fairly certain it is supposed to run ./serve instead so that NGINX loads.

The result is that SageMaker hangs forever waiting for the /health endpoint (which isn't there).

Here is the Dockerfile this code generates for me:

FROM bentoml/model-server:0.12.1-py38

# the env var $PORT is required by heroku container runtime
ENV PORT 8080
EXPOSE $PORT

RUN apt-get update --fix-missing &&     apt-get install -y nginx &&     apt-get clean

# gevent required by AWS Sagemaker
RUN pip install gevent>=20.9.0

# copy over model files
COPY . /bento
WORKDIR /bento

RUN if [ -f /bento/bentoml-init.sh ]; then bash -c /bento/bentoml-init.sh; fi

ENV PATH="/bento:$PATH"

Is there supposed to be code in there that starts the nginx service instead? (Or Yatai service?)

SageMaker Deployment is Broken in Quick-Start Guide

Hi guys,

Following the quick-start without any change, deployment to SageMaker seems to be failing. Even though I'm the admin for the account, when the deployment occurs I'm getting an error about the role that BentoML uses (the AWSServiceRoleForAmazonSageMakerNotebooks service-linked role) not having BatchGetImage permission for the ECR image that BentoML has pushed. This is what it looks like:

Successfully created AWS Sagemaker deployment iris-classifier-dev
{
  "namespace": "dev",
  "name": "iris-classifier-dev",
  "spec": {
    "bentoName": "IrisClassifier",
    "bentoVersion": "4ddf04cf5ed8a4ed871b762340dd2fc3326e2a93",
    "operator": "AWS_SAGEMAKER",
    "sagemakerOperatorConfig": {
      "region": "us-east-1",
      "instanceType": "ml.m4.xlarge",
      "instanceCount": 1,
      "apiName": "predict",
      "timeout": 60,
      "dataCaptureSamplePercent": 100
    }
  },
  "state": {
    "state": "ERROR",
    "infoJson": {
      "EndpointName": "dev-iris-classifier-dev",
      "EndpointArn": "arn:aws:sagemaker:us-east-1:865087154132:endpoint/dev-iris-classifier-dev",
      "EndpointConfigName": "dev-iris-classif-IrisClassifier-4ddf04cf5ed8a4ed87",
      "EndpointStatus": "Failed",
      "FailureReason": " The role 'arn:aws:iam::865087154132:role/aws-service-role/sagemaker.amazonaws.com/AWSServiceRoleForAmazonSageMakerNotebooks' does not have BatchGetImage permission for the image: '865087154132.dkr.ecr.us-east-1.amazonaws.com/irisclassifier-sagemaker:4ddf04cf5ed8a4ed871b762340dd2fc3326e2a93'.",
      "CreationTime": "2021-04-28 10:09:19.287000+00:00",
      "LastModifiedTime": "2021-04-28 10:09:29.386000+00:00",
      "ResponseMetadata": {
        "RequestId": "24b95550-c21b-4e21-98c7-2bb90a32b5ff",
        "HTTPStatusCode": 200,
        "HTTPHeaders": {
          "x-amzn-requestid": "24b95550-c21b-4e21-98c7-2bb90a32b5ff",
          "content-type": "application/x-amz-json-1.1",
          "content-length": "612",
          "date": "Wed, 28 Apr 2021 10:09:29 GMT"
        },
        "RetryAttempts": 0
      }
    },
    "timestamp": "2021-04-28T10:09:30.479804Z"
  },
  "createdAt": "2021-04-28T10:06:52.550566Z",
  "lastUpdatedAt": "2021-04-28T10:06:52.550600Z"
}

I'm running the quickstart as follows, the only change being deploying to SageMaker instead of Lambda:

latest_version=$(bentoml get -q IrisClassifier:latest | jq -r '.bentoServiceMetadata.version')
bentoml sagemaker deploy iris-classifier-dev -b IrisClassifier:"$latest_version" --api-name predict

This happens both when I do this from GitHub Actions and from my local machine (macOS Big Sur), and I have tried this with a couple of different accounts (all of which I have admin IAM users for setup in my credentials).

Any ideas for a fix or workaround? Since AWSServiceRoleForAmazonSageMakerNotebooks is a service-linked-role, I cannot edit it or attach more policies, so the endpoint ends up not working.
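
One possible workaround (an assumption on my part, not a fix confirmed in this thread) is to attach a repository policy to the pushed ECR repository so that the SageMaker service principal can pull the image. A minimal boto3 sketch:

# Assumed workaround, not confirmed in this thread: grant the SageMaker
# service principal read access to the ECR repository that BentoML pushed.
import json

import boto3

ecr = boto3.client("ecr", region_name="us-east-1")

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowSageMakerPull",
            "Effect": "Allow",
            "Principal": {"Service": "sagemaker.amazonaws.com"},
            "Action": [
                "ecr:BatchGetImage",
                "ecr:GetDownloadUrlForLayer",
                "ecr:BatchCheckLayerAvailability",
            ],
        }
    ],
}

ecr.set_repository_policy(
    repositoryName="irisclassifier-sagemaker",  # repository name from the error above
    policyText=json.dumps(policy),
)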

Multiple model variants per endpoint

Is your feature request related to a problem? Please describe.
Is there a plan to support deploying multiple model variants in a single SageMaker Endpoint? (or, at least, deploying two model variants per endpoint)

Describe the solution you'd like
As a minimum, A/B testing support would need a way to pass two BentoML bundles, build and deploy them both, and configure the InitialVariantWeight values for the production variants.

A more advanced solution could support deploying any number of variants.

Describe alternatives you've considered

In the case of A/B testing, it is better to write custom deployment code and not use BentoML at all.

It is possible to do it using BentoML right now, but the process is weird and error-prone:
I can deploy two endpoints using BentoML, open the AWS web console, create a new endpoint configuration using the models deployed with BentoML, deploy another endpoint with those two models, and remove the endpoints created by BentoML.
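
For reference, here is a minimal boto3 sketch of an endpoint configuration with two weighted production variants; the model, config, and endpoint names are hypothetical, and this is not something the deployment tool does today.

# Minimal sketch: one SageMaker endpoint serving two weighted model variants.
# All names below are hypothetical.
import boto3

sm = boto3.client("sagemaker", region_name="us-east-1")

sm.create_endpoint_config(
    EndpointConfigName="model-ab-config",
    ProductionVariants=[
        {
            "VariantName": "staging",
            "ModelName": "model-stg",
            "InstanceType": "ml.t2.medium",
            "InitialInstanceCount": 1,
            "InitialVariantWeight": 0.5,
        },
        {
            "VariantName": "production",
            "ModelName": "model-prd",
            "InstanceType": "ml.t2.medium",
            "InitialInstanceCount": 1,
            "InitialVariantWeight": 0.5,
        },
    ],
)

sm.create_endpoint(
    EndpointName="model-ab-endpoint",
    EndpointConfigName="model-ab-config",
)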

nginx.conf proxy_pass setting for deploying the service on AmazonSageMaker

Hello,

When I deploy the service to Amazon SageMaker, I get this error and the endpoint fails to create!

nginx: [emerg] "proxy_pass" cannot have URI part in location given by regular expression, or inside named location, or inside "if" statement, or inside "limit_except" block in /bento/nginx.conf:37

There is an issue in the default settings of the nginx.conf file of the aws-sagemaker-deploy tool:

worker_processes 1;
daemon off; # Prevent forking
pid /tmp/nginx.pid;
error_log /var/log/nginx/error.log;
events {
  # defaults
}
http {
  include /etc/nginx/mime.types;
  default_type application/octet-stream;
  access_log /var/log/nginx/access.log combined;
  upstream gunicorn {
    server unix:/tmp/gunicorn.sock;
  }
  server {
    listen 8080 deferred;
    client_max_body_size 500m;
    keepalive_timeout 5;
    proxy_read_timeout 1200s;
    location ~ ^/(ping|invocations) {
      proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
      proxy_set_header Host $http_host;
      proxy_redirect off;
      proxy_pass http://gunicorn;
    }
    location ~ /ping {
      proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
      proxy_set_header Host $http_host;
      proxy_redirect off;
      proxy_pass http://gunicorn/healthz;
    }
    location / {
      return 404 "{}";
    }
  }
}
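
The error comes from the second location block: nginx does not allow proxy_pass with a URI part (here /healthz) inside a location defined by a regular expression. One possible workaround (an assumption, not the project's shipped configuration) is an exact-match location for /ping, where a URI in proxy_pass is permitted:

# Sketch of a possible workaround (assumption): an exact-match location may
# carry a URI in proxy_pass, unlike a regex location.
location = /ping {
  proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
  proxy_set_header Host $http_host;
  proxy_redirect off;
  proxy_pass http://gunicorn/healthz;
}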

Dockerfile: no space left on device

I have an error during deployment but could not find the solution, even though I think I have enough space for the deployment.

File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/docker/errors.py", line 31, in create_api_error_from_http_exception
    raise cls(e, response=response, explanation=explanation)
docker.errors.APIError: 500 Server Error for http+docker://localhost/v1.40/build?t=.dkr.ecr.ap-northeast-1.amazonaws.com%2F-sagemaker-deployment-repo%&q=False&nocache=False&rm=False&forcerm=False&pull=False&dockerfile=Dockerfile: Internal Server Error ("Error processing tar file(exit status 1): write /FasttextEmbeddings/artifacts/model: no space left on device")
Traceback (most recent call last):
  File "aws-sagemaker-deploy/deploy.py", line 135, in <module>
    deploy(bento_bundle_path, deployment_name, config_json)
  File "aws-sagemaker-deploy/deploy.py", line 51, in deploy
    image_tag=image_tag,
  File "/home/ec2-user/SageMaker/aws-sagemaker-deploy/utils/__init__.py", line 98, in build_docker_image
    raise Exception(f"Failed to build docker image {image_tag}: {error}")
Exception: Failed to build docker image 906756132937.dkr.ecr.ap-northeast-1.amazonaws.com/fasttext-sagemaker-deployment-repo:abb: 500 Server Error for http+docker://localhost/v1.40/build?t=.dkr.ecr.ap-northeast-1.amazonaws.com%-sagemaker-deployment-repo%3Afasttextembeddings-20210929010627_014abb&q=False&nocache=False&rm=False&forcerm=False&pull=False&dockerfile=Dockerfile: Internal Server Error ("Error processing tar file(exit status 1): write /FasttextEmbeddings/artifacts/model: no space left on device")
Filesystem      Size  Used Avail Use% Mounted on
devtmpfs         16G   76K   16G   1% /dev
tmpfs            16G     0   16G   0% /dev/shm
/dev/nvme0n1p1  109G   97G   12G  90% /
/dev/nvme1n1     89G   14G   71G  17% /home/ec2-user/SageMaker

Error when trying to create an instance on aws-sagemaker: "No module named 'sagemaker_service'"

Hi, I hope someone can help:
I'm getting this error when trying to deploy my bento on SageMaker:

2024-02-07T14:58:58+0000 [INFO] [cli] Service loaded from Bento directory: bentoml.Service(tag="tensorflow_fatiguelog:noxtfiwfzcm4fmg4", path="/home/bentoml/bento/")
2024-02-07T14:58:58+0000 [INFO] [cli] Environ for worker 0: set CPU thread count to 4
2024-02-07T14:58:58+0000 [INFO] [cli] Prometheus metrics for HTTP BentoServer from "sagemaker_service:svc" can be accessed at http://localhost:8080/metrics.
2024-02-07T14:58:58+0000 [INFO] [cli] Starting production HTTP BentoServer from "sagemaker_service:svc" listening on http://0.0.0.0:8080 (Press CTRL+C to quit)
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/bentoml/_internal/service/loader.py", line 151, in import_service
    module = importlib.import_module(module_name, package=working_dir)
  File "/usr/local/lib/python3.10/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1004, in _find_and_load_unlocked

ModuleNotFoundError: No module named 'sagemaker_service'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/bentoml/_internal/service/loader.py", line 409, in load
    svc = import_service(
  File "/usr/local/lib/python3.10/site-packages/simple_di/__init__.py", line 139, in _
    return func(*_inject_args(bind.args), **_inject_kwargs(bind.kwargs))
  File "/usr/local/lib/python3.10/site-packages/bentoml/_internal/service/loader.py", line 153, in import_service
    raise ImportServiceError(f'Failed to import module "{module_name}": {e}')
bentoml.exceptions.ImportServiceError: Failed to import module "sagemaker_service": No module named 'sagemaker_service'

my bentofile.yaml:

service: "service:svc"
# description: "file: ./README.md"
labels:
  type: anomaly-detector
  owner: ow-ds
  stage: demo
include:
  - "service.py"
python:
  packages:
    - tensorflow

models:
  - tensorflow_fatiguelog:latest

service.py

import numpy as np

import bentoml
from bentoml.io import NumpyNdarray

runner = bentoml.keras.get("tensorflow_fatiguelog:latest").to_runner()

svc = bentoml.Service(
    name="tensorflow_fatiguelog",
    runners=[runner],
)


@svc.api(
    input=NumpyNdarray(dtype="float32", shape=(1, 1, 5, 4, 4)),
    output=NumpyNdarray(dtype="float32", shape=(1, 1, 5, 4, 4)),
)
async def predict(f) -> "np.ndarray":
    return await runner.async_run(f)

deployment file

api_version: v1
name: fatiguelog-anomaly-det
operator:
  name: aws-sagemaker
template: terraform
spec:
  region: us-east-1
  instance_type: ml.t2.xlarge
  initial_instance_count: 1
  timeout: 60
  initial_sampling_percentage: 1
env: {}

I am getting the same error when following this step-by-step tutorial:

https://www.youtube.com/watch?v=Zci_D4az9FU

code of this tutorial

https://github.com/jankrepl/mildlyoverfitted/tree/master/mini_tutorials/bentoml

I don't know if this error is caused by a specific permission or something else; I hope this information helps you solve my problem. Thanks a lot!

Could not find files declared in bentofile.yaml include

Hi! I was able to deploy to AWS SageMaker, but the model could not initialize (or be used) because it could not find the files that I specified in bentofile.yaml -> include -> "*.pyx"

service: "service:svc" # Same as the argument passed to `bentoml serve`
labels:
  owner: renz
  stage: prod
include:
  - "*.py" # A pattern for matching which files to include in the bento
  - "*.pyx"
python:
  packages: # Additional pip packages required by the service
    - boto3
    - ...

Everything works fine when using bentoml serve and docker run.

I think the files are not copied when creating the deployable, or they end up in a different directory (?). I tried listing the files in the directory at runtime (os.listdir('.')) but the *.pyx files aren't there, only these:

FILES IN DIR: ['sagemaker_service.py', 'serve', 'README.md', 'bento.yaml']

Link to sagemaker_config.json is broken

On the main GitHub page of this project, in this section, the link to sagemaker_config.json is broken: https://github.com/bentoml/aws-sagemaker-deploy/blob/main/sagemaker_config.json does not exist ("A sample configuration file has been given has been provided here").
