Comments (9)
Hi @airmonitor , thank you for reporting this. Did you have a look at the associated sample we have for this construct ? https://github.com/aws-samples/generative-ai-cdk-constructs-samples/blob/main/samples/sagemaker_custom_endpoint/lib/sagemaker_custom_endpoint-stack.ts
from generative-ai-cdk-constructs.
Hello @krokoko
Thx for reaching out.
I will try with your suggestion.
Do you have also a working example for adding application autoscaling?
I need to have scalability added there so the number of endpoint instances will scale base on metric.
I've tried with application autoscaling but it was failing due to not working add_dependency method.
In short the endpoint creation was still in progress when application autoscaling tried to add required functionality.
AWS support added DependsOn section to the synthesised template but I was not able to implement the same using CDK due to missing add dependency method.
Regards
Tom
from generative-ai-cdk-constructs.
I've tested the proposed solution to use the IAM role that is created with CustomSageMakerEndpoint but apparently, I'm doing something wrong here
For the below scenario I'm getting errors from the cloudformation:
custom_sagemaker_endpoint = CustomSageMakerEndpoint(
self,
id=model_name,
container=ContainerImage.from_ecr_repository(repository=self.whisper_ecr_repo, tag=config_vars.image_tag),
enable_operational_metric=True,
endpoint_name=f"{config_vars.stage}-{config_vars.project}-{model_name}",
instance_count=1,
instance_type=SageMakerInstanceType.ML_G5_2_XLARGE # type: ignore
if config_vars.stage == "prod"
else SageMakerInstanceType.ML_G5_XLARGE, # type: ignore
model_data_url=f"s3://{self.model_data_s3_bucket.bucket_name}/model.tar.gz",
model_id=f"{config_vars.stage}-{config_vars.project}-{model_name}",
environment={"SAGEMAKER_TRITON_DEFAULT_MODEL_NAME": f"whisper-{model_name}"},
)
self.whisper_ecr_repo.grant_pull(custom_sagemaker_endpoint.role) # ECR Repository
self.model_data_s3_bucket.grant_read_write(custom_sagemaker_endpoint.role) # S3 bucket
Error from CloudFormation
9:56:57 AM | CREATE_FAILED | AWS::SageMaker::Model | devbackendwhisperl...odellargev2vanilla
Could not access model data at s3://s3-bucket/model.tar.gz. Please ensure that the role "arn:aws:iam::11223344:role/dev-backend-whisper-core--largev2vanilla
RoleC7CE1EE-xaSDM2XaG6fE" exists and that its trust relationship policy allows the action "sts:AssumeRole" for the service principal "sagemaker.amazonaws.com". Also ensure that the
role has "s3:GetObject" permissions and that the object is located in us-east-1. If your Model uses multiple models or uncompressed models, please ensure that the role has "s3:ListB
ucket" permission. (Service: AmazonSageMaker; Status Code: 400; Error Code: ValidationException; Request ID: 591d86b1-0e24-4d83-a7ec-f3f12b30402d; Proxy: null)
❌ backend-whisper-pipeline/dev-backend-whisper-core-stage/core-stack (dev-backend-whisper-core-stage-core-stack) failed: Error: The stack named dev-backend-whisper-core-stage-core-stack failed creation, it may need to be manually deleted from the AWS console: ROLLBACK_COMPLETE: Could not access model data at s3://s3-bucket/model.tar.gz. Please ensure that the role "arn:aws:iam::11223344:role/dev-backend-whisper-core--largev2vanillaRoleC7CE1EE-xaSDM2XaG6fE" exists and that its trust relationship policy allows the action "sts:AssumeRole" for the service principal "sagemaker.amazonaws.com". Also ensure that the role has "s3:GetObject" permissions and that the object is located in us-east-1. If your Model uses multiple models or uncompressed models, please ensure that the role has "s3:ListBucket" permission. (Service: AmazonSageMaker; Status Code: 400; Error Code: ValidationException; Request ID: 591d86b1-0e24-4d83-a7ec-f3f12b30402d; Proxy: null)
at FullCloudFormationDeployment.monitorDeployment (/opt/homebrew/lib/node_modules/aws-cdk/lib/index.js:435:10568)
at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
at async Object.deployStack2 [as deployStack] (/opt/homebrew/lib/node_modules/aws-cdk/lib/index.js:438:199515)
at async /opt/homebrew/lib/node_modules/aws-cdk/lib/index.js:438:181237
❌ Deployment failed: Error: The stack named dev-backend-whisper-core-stage-core-stack failed creation, it may need to be manually deleted from the AWS console: ROLLBACK_COMPLETE: Could not access model data at s3://s3-bucket/model.tar.gz. Please ensure that the role "arn:aws:iam::11223344:role/dev-backend-whisper-core--largev2vanillaRoleC7CE1EE-xaSDM2XaG6fE" exists and that its trust relationship policy allows the action "sts:AssumeRole" for the service principal "sagemaker.amazonaws.com". Also ensure that the role has "s3:GetObject" permissions and that the object is located in us-east-1. If your Model uses multiple models or uncompressed models, please ensure that the role has "s3:ListBucket" permission. (Service: AmazonSageMaker; Status Code: 400; Error Code: ValidationException; Request ID: 591d86b1-0e24-4d83-a7ec-f3f12b30402d; Proxy: null)
at FullCloudFormationDeployment.monitorDeployment (/opt/homebrew/lib/node_modules/aws-cdk/lib/index.js:435:10568)
at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
at async Object.deployStack2 [as deployStack] (/opt/homebrew/lib/node_modules/aws-cdk/lib/index.js:438:199515)
at async /opt/homebrew/lib/node_modules/aws-cdk/lib/index.js:438:181237
The stack named dev-backend-whisper-core-stage-core-stack failed creation, it may need to be manually deleted from the AWS console: ROLLBACK_COMPLETE: Could not access model data at s3://s3-bucket/model.tar.gz. Please ensure that the role "arn:aws:iam::11223344:role/dev-backend-whisper-core--largev2vanillaRoleC7CE1EE-xaSDM2XaG6fE" exists and that its trust relationship policy allows the action "sts:AssumeRole" for the service principal "sagemaker.amazonaws.com". Also ensure that the role has "s3:GetObject" permissions and that the object is located in us-east-1. If your Model uses multiple models or uncompressed models, please ensure that the role has "s3:ListBucket" permission. (Service: AmazonSageMaker; Status Code: 400; Error Code: ValidationException; Request ID: 591d86b1-0e24-4d83-a7ec-f3f12b30402d; Proxy: null)
from generative-ai-cdk-constructs.
Hi @airmonitor , did you provide your role with permissions to access the S3 bucket storing your artifacts ? We have this demonstrated in the sample here: https://github.com/aws-samples/generative-ai-cdk-constructs-samples/blob/main/samples/sagemaker_custom_endpoint/lib/sagemaker_custom_endpoint-stack.ts#L43C5-L57C7
from generative-ai-cdk-constructs.
Hey @krokoko.
I'm not sure if I get your intention here.
The example for extending IAM permissions is using non optimal way for granting necessary permissions
customEndpoint.addToRolePolicy(
new iam.PolicyStatement({
effect: iam.Effect.ALLOW,
actions: [
's3:GetObject',
's3:GetObject*',
's3:GetBucket*',
's3:List*',
],
resources: [
'BUCKET_ARN',
'BUCKET_ARN/*',
],
}),
);
In CDK we have a better way, shorter way to achieve the required results using the following examples
self.whisper_ecr_repo.grant_pull(custom_sagemaker_endpoint.role) # ECR Repository
self.model_data_s3_bucket.grant_read_write(custom_sagemaker_endpoint.role) # S3 bucket
grant_pull, grant_read_write method for ecr repository and s3 bucket should do the trick.
I'm not sure why we need to get back mentally a few years later when those methods weren't available in CDK?
Regards
Tom
from generative-ai-cdk-constructs.
Hi @airmonitor , sorry I just missed the last line of your previous example, I was thinking that permissions were missing on the role to access the model artifacts in S3.
Is the endpoint created in the same region where the bucket was created ?
from generative-ai-cdk-constructs.
Hello,
Yes, all resources are deployed in us-east-1
from generative-ai-cdk-constructs.
This issue is now marked as stale because it hasn't seen activity for a while. Add a comment or it will be closed soon. If you wish to exclude this issue from being marked as stale, add the "backlog" label.
from generative-ai-cdk-constructs.
I am not able to reproduce the issue. Tested both with the provided role by the construct, and by creating a role outside of the construct and passing it through constructor:
const roleSG = new iam.Role(this, 'Role', {
assumedBy: new iam.ServicePrincipal('sagemaker.amazonaws.com'),
managedPolicies: [
iam.ManagedPolicy.fromAwsManagedPolicyName('AmazonSageMakerFullAccess'),
],
});
const endpoint = new genai.CustomSageMakerEndpoint(this, 'testllavaendpoint', {
...
role: roleSG
...
}
In both cases, the construct is successfully deployed.
One thing is for the samples we have and the tests I did, I am using the default bucket created by the SageMaker SDK (see https://sagemaker.readthedocs.io/en/stable/api/utility/session.html, default_bucket) when preparing the model. It might be worth on your end to create the same default session bucket and compare it with the bucket you provide to the construct, and see if any configuration difference stands out.
I will close the issue, if you get more details and/or can upload a sample stack to a public repo which we can use to consistently reproduce the issue, please reopen this ticket. Thank you !
from generative-ai-cdk-constructs.
Related Issues (20)
- (aws-rag-appsync-stepfn-opensearch): image ingestion failing HOT 2
- (aws-rag-appsync-stepfn-opensearch): Documentation unclear on embeddings model selection impact
- (aws-rag-appsync-stepfn-opensearch): CRUD operations on documents HOT 1
- agent: cannot disable user input for an agent HOT 6
- Observability: Add pre-built dashboards for basic infrastructure and model metrics HOT 4
- feat(models): (memory support) HOT 1
- (bedrock): implement prompt flows HOT 6
- (bedrock): s3-data-source - add hierarchical and semantic chunking support HOT 1
- (monitoring): cloudwatch dashboard to monitor AOSS
- (monitoring): cloudwatch dashboard for monitor sagemaker endpoints
- (bedrock): support additional data sources HOT 6
- (bedrock): guardrail contextual grounding support
- (bedrock): Knowledge base BedrockEmbeddingModelConfiguration HOT 1
- (documentation): code snippets for .Net and Go
- (bedrock): Add prompt and prompt management
- (bedrock): BedrockFoundationModel class implements IModel
- (OpenSearch ): Repeated CloudFormation while deleting OpenSearch VectorIndex
- Monthly issue metrics report HOT 1
- Monthly issue metrics report HOT 1
- (OpenSearch): Metadata Field property not changed to filterable false HOT 1
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from generative-ai-cdk-constructs.