This project deploys a pre-trained GPT-2 model from the Hugging Face model hub to Amazon SageMaker for real-time inference. Model artifacts are stored in Amazon S3, and the SageMaker endpoint is integrated with an AWS Lambda function and Amazon API Gateway for efficient, scalable model serving.
![MLOps pipeline architecture](https://private-user-images.githubusercontent.com/92028472/305195797-ac51219f-d2ba-40c3-a2b6-a3fb19cb3e54.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MTkyNzk5MzIsIm5iZiI6MTcxOTI3OTYzMiwicGF0aCI6Ii85MjAyODQ3Mi8zMDUxOTU3OTctYWM1MTIxOWYtZDJiYS00MGMzLWEyYjYtYTNmYjE5Y2IzZTU0LnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA2MjUlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwNjI1VDAxNDAzMlomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTEwNWYzZDY2MjA2NWY5ZmUxYmY4NzY0Yzg5M2NhYjMwZjhkMDQ0NDk0YjRhODFkZmRmNDhhNWIzZDhjNmZhYWImWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.qu-iaEhtK-OTxjRgR_GLbJ1V_qMnHZoL97pVmCux-Jk)
The MLOps pipeline architecture illustrates the flow of activities involved in deploying and managing machine learning models. This comprehensive workflow encompasses various stages, including data preparation, model training, deployment, monitoring, and maintenance. Each stage plays a crucial role in ensuring the successful and efficient operation of machine learning systems.
- `data/`: Houses all project-related data.
- `model/`: Stores the GPT-2 model weights.
- `notebooks/`: Contains experimentation notebooks.
- `scripts/`: Hosts Python scripts for local and remote execution.
- `src/`: Organizes the model as a package along with associated modules.
- `tests/`: Comprises a suite of tests to ensure model robustness.
- `requirements.txt`: Lists all project dependencies for reproducibility.
1. Clone the repository:

   ```shell
   git clone https://github.com/Alpha-131/MYM-assessment-task.git
   ```

2. Configure AWS settings:

   Modify the AWS configurations in the relevant scripts to match your environment.

3. Upload the model to S3:

   ```shell
   python upload_to_s3.py
   ```

4. Deploy the SageMaker model:

   ```shell
   python deploy_to_sagemaker.py
   ```

5. Set up the Lambda function:

   Integrate the SageMaker endpoint with a Lambda function for request processing.

6. Configure API Gateway:

   Create a production or testing stage in API Gateway and link it to the Lambda function for API access.

7. Access the API URL:

   Make model inference requests against the deployed API:

   ```
   https://<some_random_code>.execute-api.<region>.amazonaws.com/<stage_name>/<resource_name>
   ```
- Download GPT-2 model weights.
- Create a model.tar.gz archive for the model artifacts.
- Upload model.tar.gz to Amazon S3.
- Write a deployment script for the SageMaker endpoint.
- Establish CI/CD pipeline for automated deployment.
- Define YAML configuration file for CI/CD pipeline.
- Implement monitoring for the SageMaker endpoint using Amazon CloudWatch.
- Implement logging with AWS CloudTrail, storing the logs in an S3 bucket, for SageMaker endpoints.
- Configure autoscaling for dynamic traffic-based scaling.
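The model.tar.gz packaging step in the list above can be sketched with the standard library; the directory layout (weights under `model/`) is an assumption based on the repo structure:

```python
import os
import tarfile


def package_model(model_dir: str, archive_path: str = "model.tar.gz") -> str:
    """Bundle the model weights into the gzipped tarball SageMaker expects.

    SageMaker extracts the archive into /opt/ml/model, so each file is added
    at the archive root (via arcname) rather than nested under the directory.
    """
    with tarfile.open(archive_path, "w:gz") as tar:
        for name in os.listdir(model_dir):
            tar.add(os.path.join(model_dir, name), arcname=name)
    return archive_path


# Example: package_model("model/") before running upload_to_s3.py
```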